ABSTRACT

We consider the following problem. Suppose a rooted tree T is available for
preprocessing. Answer on-line queries requesting the lowest common ancestor for
any pair of vertices in T. We present a linear time and space preprocessing
algorithm which enables us to answer each query in O(1) time, as in Harel and Tarjan
[HT-84]. Our algorithm has the advantage of being simple and easily parallelizable.
The resulting parallel preprocessing algorithm runs in logarithmic time using
an optimal number of processors on an EREW PRAM. Each query is then
answered in O(1) time using a single processor.


† The research of both authors was supported by the Applied Mathematical Sciences subprogram of the Office of Energy Research,
U.S. Department of Energy under contract number DE-AC02-76ER03077.

‡ The research of this author was supported by NSF grant NSF-DCR-8318874 and ONR grant N00014-85-K-0046.


1.  Introduction 

We consider the following problem. Given a rooted tree T(V, E) for preprocessing, answer on-line LCA
queries of the form, "Which vertex is the Lowest Common Ancestor (LCA) of x and y?" for any pair of
vertices x, y in T. (Let us denote such a query LCA(x, y).) We present a preprocessing algorithm which runs in
linear time and linear space on the serial RAM model. (For the definition of a RAM model see, e.g.,
[AHU-74].) Given this preprocessing we show how to process each such LCA query in constant time.

We also consider parallelization of our algorithm. The model of parallel computation used is the
exclusive-read exclusive-write (EREW) parallel random access machine (PRAM). A PRAM employs p
synchronous processors all having access to a common memory. An EREW PRAM does not allow simultaneous
access by more than one processor to the same memory location for either read or write purposes. See [Vi-83]
for a survey of results concerning PRAMs.

Let Seq(n) be the fastest known worst-case running time of a sequential algorithm, where n is the length
of the input for the problem at hand. A parallel algorithm that runs in O(Seq(n)/p) time using p processors is
said to have optimal speed-up or, more simply, to be optimal. A primary goal in parallel computation is to
design optimal algorithms that also run as fast as possible.

Our preprocessing algorithm is easily parallelized to obtain an optimal parallel preprocessing algorithm
which runs in O(log n) time using n/log n processors on an EREW PRAM, where n is the number of vertices
in T. Parallelizing the query processing is straightforward provided read conflicts are allowed: k queries can
be processed in O(1) time using k processors.

In their extensive paper [HT-84], Harel and Tarjan gave a serial algorithm for the same problem. The
performance of their algorithm is the same as ours. However, our algorithm has two advantages: (1) It is
considerably simpler in both the preprocessing stage and the query processing. (2) It leads to a simple parallel
algorithm. Consider a dynamic LCA problem, in which the input is a collection of trees and edges can be
added (or perhaps even removed) dynamically. [HT-84] gave an algorithm for some special case of this
problem. We leave it open whether their algorithm can be simplified or whether more general versions of the
dynamic LCA problem can be either simplified or improved.

Observe that using our parallel preprocessing algorithm we can process k off-line LCA queries in
O(log n) time using (n+k)/log n processors provided read conflicts are allowed. This affects the performance
of parallel algorithms for three problems: (1) Given an undirected graph, orient its edges so that the resulting
digraph is strongly connected (if such an orientation is possible) [Vi-85]. (2) Computing an open ear
decomposition and st-numbering of a biconnected graph [MSV-86]. Using the new parallel connectivity and list ranking
algorithms of [CV-86a] it has become possible to solve each of these problems in logarithmic time using an
optimal number of processors only when m > n log n, where n is the number of vertices and m is the number
of edges in the input graph. Our off-line LCA computation enables extending the range of optimal speed-up
logarithmic time parallel algorithms for these problems to sparser graphs, where m > n α(m, n) and α is the
inverse Ackermann function, as in the above connectivity algorithm. (3) Approximate string matching
[LV-86]. The new parallel suffix tree construction of [LSV-86] together with the present parallel LCA computation
lead to a considerable simplification of the parallel algorithm of [LV-86]. This simplification has already been
described in [LSV-86].

The paper is organized as follows. Section 2 gives a high-level description of the algorithm. Section 3
describes the preprocessing stage. In Section 4 we show how to process LCA queries in T using the outcome
of the preprocessing stage. Section 5 presents the parallelization of our preprocessing stage.

2.  High-level  description 

The whole algorithm is based on the following two observations: (1) Had our input tree been a simple
path, it would have been possible to preprocess it (by way of computing the distance of each vertex from the
root, as explained below) and later answer each LCA query in constant time. (2) Had our input tree been a
complete binary tree, it would have been possible to preprocess it (by way of computing its inorder
numbering, as explained below) and later answer each LCA query in constant time.
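
The two observations can be illustrated with a small sketch. The function names below are ours and do not appear in the paper. The second function is a standalone routine for a complete binary tree whose vertices are identified by their inorder numbers; as an assumption of this sketch, it also takes the height of each vertex into account so that it stays correct when one vertex is an ancestor of the other.

```python
def lca_on_path(x, y, level):
    """Observation (1): on a simple path hanging from the root,
    the LCA is the vertex closer to the root."""
    return x if level[x] <= level[y] else y


def trailing_zeros(b):
    """Index of the rightmost "1" bit of b (requires b > 0)."""
    return (b & -b).bit_length() - 1


def lca_in_complete_binary_tree(a, b):
    """Observation (2): LCA of inorder numbers a, b >= 1."""
    if a == b:
        return a
    # The answer sits at least as high as the leftmost differing bit and
    # at least as high as either vertex (a vertex is its own ancestor).
    i = max((a ^ b).bit_length() - 1, trailing_zeros(a), trailing_zeros(b))
    # Keep the bits above index i, then append a single "1" and i "0"s.
    return ((a >> (i + 1)) << (i + 1)) | (1 << i)
```

For instance, in the 31-vertex tree of Fig. 3.2 the leaves 9 and 15 meet at vertex 12, and vertex 4 is its own LCA with its descendant 6.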

The preprocessing stage assigns a number INLABEL(v) to each vertex v in T. Motivated by observation
(1), these numbers satisfy the following Path Partition property: The INLABEL numbers partition the tree T
into paths, called INLABEL paths. Each INLABEL path consists of the vertices which have the same
INLABEL number.

Let B be the smallest complete binary tree having at least n vertices. Our description identifies each vertex in
B by its inorder number. Motivated by observation (2), the INLABEL numbers also satisfy the following
Inorder property: The INLABEL numbers map each vertex v in T into the vertex INLABEL(v) in B, such that
the descendants of v are mapped into descendants of INLABEL(v) in B (v is considered both a descendant
and an ancestor of itself).

Section 4 describes how to process a query LCA(x, y) for any pair of vertices x, y in T. The processing
breaks into two cases. The simpler case is where x and y belong to the same INLABEL path. In the
preprocessing stage we compute for each vertex v in T its distance from the root into LEVEL(v). So, LCA(x, y) is
simply the vertex among x and y which is closer to the root. The more complicated case is where
INLABEL(x) ≠ INLABEL(y). First, we find the LCA of INLABEL(x) and INLABEL(y) in the complete
binary tree B, denoted by b. Let z = LCA(x, y) in T. Second, we find INLABEL(z). INLABEL(z) is the
lowest ancestor of b in B which is the INLABEL number of a common ancestor of x and y in T. Third, we
find the lowest ancestor of x, denoted x', and the lowest ancestor of y, denoted y', in the path defined by
INLABEL(z) in T. The second and third steps use two more results of the preprocessing stage: numbers
ASCENDANT(v), for each vertex v, and table HEAD. Fourth, z is simply the vertex among x' and y' which is
closer to the root.

3.  The  Preprocessing  Stage 

The outcome of the preprocessing stage consists of labels which are assigned to the vertices of T and a
look-up table, called HEAD. The label of each vertex v in T consists of three numbers: INLABEL(v),
ASCENDANT(v) and LEVEL(v).

We start with computing INLABEL(v), for each vertex v in T. This is done in two steps. After a
discussion of these two steps we show how to implement them.

Let PREORDER(v) be the serial number of v in preorder traversal of T and SIZE(v) be the number of
vertices in the subtree rooted at v. A definition of preorder traversal can be found, e.g., in [AHU-74], pp. 54-55.
Step 1. Compute PREORDER(v) and SIZE(v).

We note that the PREORDER numbers of the vertices in the subtree rooted at v range between
PREORDER(v) and PREORDER(v) + SIZE(v) - 1, and therefore, the closed interval
[PREORDER(v), PREORDER(v) + SIZE(v) - 1] is called the interval of v.

In Step 2 we consider the binary representation of the (integer) numbers in the interval of v. We remark that
throughout this paper we alternately refer to numbers and to their binary representations. No confusion will
arise.

Step  2.  Find  the  (integer)  number  which  has  the  maximal  number  of  rightmost  "0"  bits  in  the  interval  of  v. 
This  number  is  assigned  to  INLABEL  (v  ). 

For  an  example  of  computations  described  in  this  section  see  Fig.  3.1. 
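
Steps 1 and 2 can also be traced directly from their definitions. The following is a minimal sketch, assuming the tree is given as a dictionary `children` mapping a vertex to the ordered list of its sons (a representation chosen here for illustration, not prescribed by the paper). Step 2 is done by brute force over the interval, so this shows what INLABEL means rather than how to compute it in constant time.

```python
def preorder_and_size(children, root):
    """Step 1: return (PREORDER, SIZE) dicts; PREORDER numbers start at 1."""
    preorder, size = {}, {}
    stack = [(root, False)]
    counter = 0
    while stack:
        v, done = stack.pop()
        if done:
            # All sons are finished, so their sizes are known.
            size[v] = 1 + sum(size[c] for c in children.get(v, []))
            continue
        counter += 1
        preorder[v] = counter
        stack.append((v, True))
        for c in reversed(children.get(v, [])):
            stack.append((c, False))
    return preorder, size


def inlabel_by_definition(lo, hi):
    """Step 2, brute force: the number in [lo, hi] with the most
    rightmost "0" bits (k & -k equals 2**(number of such bits))."""
    return max(range(lo, hi + 1), key=lambda k: k & -k)


children = {'r': ['a', 'b'], 'a': ['c', 'd'], 'b': ['e']}
pre, size = preorder_and_size(children, 'r')
inlabel = {v: inlabel_by_definition(pre[v], pre[v] + size[v] - 1) for v in pre}
```

On this six-vertex tree the vertices r, a and d all receive INLABEL 4, b and e receive 6, and c receives 3, so the INLABEL numbers split the tree into three paths.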

Fig. 3.1. Example. A tree with four numbers: PREORDER, INLABEL, ASCENDANT and LEVEL at each vertex.

Fig. 3.2. Example. Inorder numbering of the complete binary tree with 31 vertices.

Discussion. We show that the INLABEL numbers satisfy the two properties defined in the high-level
description of the previous section. Observe that the intervals of the sons of v must be pairwise disjoint. Therefore,
INLABEL(v) belongs to the interval of at most one son of v. Denote such son by u. By the selection of the
INLABEL numbers (Step 2), INLABEL(u) = INLABEL(v) (if u exists), and for any other son w of v,
INLABEL(w) ≠ INLABEL(v). This implies the Path Partition property of the INLABEL numbers. Let u be
any descendant of v in T. Next, we show that INLABEL(u) is a descendant of INLABEL(v) in the complete
binary tree B. (Recall that our description identifies each vertex in B by its inorder number.) Consider two
vertices b and c in B. We first give a necessary and sufficient condition for c to be a descendant of b in B
and then show that INLABEL(u) and INLABEL(v) satisfy this condition. Let l = ⌈log(n+1)⌉ - 1 and i be the
number of rightmost "0" bits in b. That is, b consists of l-i leftmost bits followed by a single "1" and i "0"s.
The vertex c is a descendant of b if and only if (1) the l-i leftmost bits of c are the same as the l-i leftmost
bits of b, and (2) the number of rightmost "0" bits in c is at most i. For an example of a complete binary tree and
its inorder numbering see Fig. 3.2. Let i be the number of rightmost "0" bits in INLABEL(v). Since
INLABEL(u) belongs to the interval of v and INLABEL(v) has the maximal number of rightmost "0" bits in
this interval, the number of rightmost "0" bits in INLABEL(u) must be at most i, and the l-i leftmost bits in
INLABEL(u) must be the same as the l-i leftmost bits in INLABEL(v). This implies that INLABEL(u) is a
descendant of INLABEL(v) in B, and, in general, the Inorder property of the INLABEL numbers.
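
The descendant condition used in this discussion translates into two bit tests. A minimal sketch follows; the function names are ours, and the vertices of B are assumed to be given by their inorder numbers.

```python
def rightmost_one(b):
    """Index of the rightmost "1" bit of b, i.e. the number of
    rightmost "0" bits (requires b > 0)."""
    return (b & -b).bit_length() - 1


def is_descendant_in_B(c, b):
    """True if c is a descendant of b in the complete binary tree B."""
    i = rightmost_one(b)
    # (1) The bits to the left of index i agree in c and b.
    same_prefix = (c >> (i + 1)) == (b >> (i + 1))
    # (2) c has at most i rightmost "0" bits.
    return same_prefix and rightmost_one(c) <= i
```

For example, 6 (binary 110) is a descendant of 4 (binary 100), while 4 is not a descendant of 6 because it has too many rightmost "0" bits.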

Implementation: Step (1) is implemented in linear time and linear space, using preorder traversal of T. Given
PREORDER(v) and SIZE(v), for each vertex v in T, Step (2) is implemented in constant time per vertex in
two substeps.


Step 2.1. Compute ⌊log[(PREORDER(v) - 1) xor (PREORDER(v) + SIZE(v) - 1)]⌋ into i. Let us explain this.

The bitwise logical exclusive OR (denoted xor) of PREORDER(v) - 1 and PREORDER(v) + SIZE(v) - 1
assigns "1" to each bit in which PREORDER(v) - 1 and PREORDER(v) + SIZE(v) - 1 differ. The floor of
the (base two) logarithm gives the index of the leftmost bit of difference (counting from the rightmost bit
whose index is 0). Note that the bit indexed i must be "0" in PREORDER(v) - 1 and "1" in
PREORDER(v) + SIZE(v) - 1, since the second number is larger.

Step 2.2 shows how to "compose" INLABEL(v). For this, we need two observations: (1) The l-i+1 leftmost
bits of INLABEL(v) are the same as the l-i+1 leftmost bits in PREORDER(v) + SIZE(v) - 1. (2) The i other
bits in INLABEL(v) are "0"s.

Step 2.2. Compute 2^i ⌊(PREORDER(v) + SIZE(v) - 1) / 2^i⌋ into INLABEL(v). This assigns the l-i+1 leftmost bits
in PREORDER(v) + SIZE(v) - 1 to the l-i+1 leftmost bits in INLABEL(v) and "0"s to the other bits of
INLABEL(v).

Remark: The above computation is based on PREORDER numbering of the vertices of T. This numbering has
the property that the numbers assigned to the subtree rooted at any vertex of T provide a consecutive series of
integers. In fact, any alternative numbering having this property (e.g., POSTORDER, INORDER) will produce
INLABEL numbers which will be suitable for our preprocessing stage.

We proceed to the computation of the ASCENDANT numbers. The general idea is that for each vertex v,
the single number ASCENDANT(v) will record the INLABEL numbers of all the ancestors of v in T. We
observe that, from the viewpoint of vertex v, the INLABEL number of each of its ancestors can be fully
specified by the index of its rightmost "1". This is since the bits which are to the left of this "1" are the same as
their respective bits in INLABEL(v). Like the INLABEL numbers, ASCENDANT(v) is also an (l+1)-bit
number. Denote the binary representation of ASCENDANT(v) by the binary sequence A_l(v),...,A_0(v). We set
A_i(v) = 1 only if i is the index of a rightmost "1" in the INLABEL number of an ancestor of v in T. To
compute the ASCENDANT numbers, we scan the vertices of T from its root r down to its leaves (use, for instance,
Breadth-First Search). We start with ASCENDANT(r) = 2^l. Consider a vertex v other than r in T and let F(v)
be the father of v in T. If INLABEL(v) = INLABEL(F(v)) then we assign ASCENDANT(F(v)) to
ASCENDANT(v), otherwise, we assign ASCENDANT(F(v)) + 2^i to ASCENDANT(v), where i is the index of
the rightmost "1" in INLABEL(v). It can be easily verified that i is given by
log(INLABEL(v) - [INLABEL(v) and (INLABEL(v) - 1)]), where and denotes bitwise logical AND.
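
Steps 2.1 and 2.2 and the ASCENDANT recurrence can be sketched in a few lines. This is an illustration under assumptions of ours: the function names are not from the paper, `order` is any listing of the vertices in which every father precedes its sons, and `parent[r]` is `None` for the root.

```python
def floor_log2(k):
    """Floor of the base-two logarithm of k (requires k > 0)."""
    return k.bit_length() - 1


def inlabel(pre, size):
    """INLABEL from PREORDER(v) and SIZE(v) in constant time."""
    last = pre + size - 1
    i = floor_log2((pre - 1) ^ last)      # Step 2.1: leftmost differing bit
    return (last >> i) << i               # Step 2.2: 2**i * floor(last / 2**i)


def rightmost_one(k):
    """Index of the rightmost "1", via k - [k and (k - 1)] as in the text."""
    return floor_log2(k - (k & (k - 1)))


def ascendant_numbers(order, parent, inl, n):
    l = floor_log2(n)                     # equals ceil(log(n + 1)) - 1
    asc = {}
    for v in order:
        p = parent[v]
        if p is None:
            asc[v] = 1 << l               # ASCENDANT(r) = 2**l
        elif inl[v] == inl[p]:
            asc[v] = asc[p]               # same INLABEL path as the father
        else:
            asc[v] = asc[p] + (1 << rightmost_one(inl[v]))
    return asc
```

On a tree with PREORDER intervals [1,6], [2,4], [3,3], [4,4], [5,6], [6,6] this yields INLABEL numbers 4, 4, 3, 4, 6, 6, matching a brute-force search for the number with the most rightmost "0" bits in each interval.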

Recall that LEVEL(v), for each vertex v in T, is the distance, counting edges, of the path from v to the
root r. Computation of the LEVEL numbers is straightforward and can be done using, e.g., Breadth-First
Search.

Recall that Fig. 3.1 gives an example of the labels.

We conclude by describing how to compute the table HEAD. HEAD(k) contains the vertex which is
closest to the root in the path consisting of all vertices whose INLABEL number is k. HEAD(k) is sometimes
called the head of the INLABEL path k. Computation of the table HEAD is trivial. For each vertex v such
that INLABEL(v) ≠ INLABEL(F(v)) we assign v to HEAD(INLABEL(v)). This, again, takes linear time and
linear space.
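
A minimal sketch of the HEAD computation follows. The function name is ours, and treating the root (which has no father) as the head of its own INLABEL path is an assumption made explicit here.

```python
def head_table(parent, inl):
    """HEAD[k] = vertex closest to the root among those with INLABEL k."""
    head = {}
    for v, p in parent.items():
        # v starts a new INLABEL path if it is the root (assumed case)
        # or if its father carries a different INLABEL number.
        if p is None or inl[v] != inl[p]:
            head[inl[v]] = v
    return head


parent = {'r': None, 'a': 'r', 'c': 'a', 'd': 'a', 'b': 'r', 'e': 'b'}
inl = {'r': 4, 'a': 4, 'c': 3, 'd': 4, 'b': 6, 'e': 6}
head = head_table(parent, inl)
```

Each vertex is inspected once, so the table is built in linear time, and only one vertex per INLABEL path ever writes to its entry.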

A general implementation remark: The time bounds of both the preprocessing stage and the query processing
depend on the ability to perform multiplication, division, powers of two, bitwise AND, base two discrete
logarithm and bitwise exclusive OR in constant time. If these operations are not part of the machine's repertoire,
look-up tables for each missing operation are prepared in linear time and linear space as part of the
preprocessing stage. These tables will be used to perform the missing operations in O(1) operations which are in the
repertoire.
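
As one illustration of such a table, the base two discrete logarithm of every number up to n can be tabulated in linear time. This is a sketch under the assumption that halving and addition are available; the function name is ours.

```python
def build_floor_log2_table(n):
    """table[k] = floor(log2(k)) for 1 <= k <= n; table[0] is unused."""
    table = [0] * (n + 1)
    for k in range(2, n + 1):
        # Halving k removes exactly one bit, so the logarithm drops by one.
        table[k] = table[k >> 1] + 1
    return table
```

After this preprocessing each logarithm is a single array access.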

We finally note the two points in which our algorithm is similar to [HT-84]: (1) The basic observations
that it is possible to answer LCA queries in simple paths and complete binary trees in constant time. (2) The
idea of packing information regarding several vertices (as in the ASCENDANT numbers) into a single number.
However, the final preprocessing stage and query processing are different.
4. Processing LCA queries

In  this  section  we  show  how  to  answer  LCA  queries  using  the  outcome  of  the  preprocessing  stage. 

Consider  a  query  LCA  (x  ,y ),  for  any  pair  of  vertices  x  ,y  in  T.  (To  illustrate  the  presentation  the  reader  is 
referred  to  Fig.  3.1.)  There  are  two  cases: 

(Case A) INLABEL(x) = INLABEL(y). It must be that x and y are in the same INLABEL path. We conclude
that LCA(x, y) is x if LEVEL(x) ≤ LEVEL(y) and y otherwise.
(Case B) INLABEL(x) ≠ INLABEL(y). Let z be LCA(x, y). We find z in four steps:

Step 1. Find b, the LCA of INLABEL(x) and INLABEL(y) in the complete binary tree B, as follows. Let i be
the index of the rightmost "1" in b. Since b is a common ancestor of INLABEL(x) and INLABEL(y) in B, the
l-i leftmost bits in INLABEL(x) and INLABEL(y) must be the same as these bits in b. Since b is the lowest
common ancestor of INLABEL(x) and INLABEL(y), i must be the minimum index such that the l-i leftmost
bits in INLABEL(x) and INLABEL(y) are the same. Hence, i is the index of the leftmost bit in which
INLABEL(x) and INLABEL(y) differ, and b consists of the l-i leftmost bits in INLABEL(x) (or
INLABEL(y)) followed by a single "1" and i "0"s.
Step 2. Find INLABEL(z) (where z is LCA(x, y)). For this we find the index of the rightmost "1" in
INLABEL(z), denoted by j. Since z is a common ancestor of x and y in T, A_j(x) = 1 and A_j(y) = 1. We
observe that INLABEL(z) is the lowest ancestor of b in B which is also the INLABEL number of a common
ancestor of x and y in T. Therefore, the index j must be the index of the rightmost "1" in A_l(x),...,A_i(x) and
A_l(y),...,A_i(y). INLABEL(z) consists of the l-j leftmost bits of INLABEL(x) (or INLABEL(y)) followed by a
single "1" and j "0"s.

Step 3. Find x', the lowest ancestor of x in the path defined by INLABEL(z). Also, find y', the lowest ancestor
of y in this same path. We show how to find x'; y' is found similarly.

If INLABEL(x) = INLABEL(z) then x' = x and nothing has to be done. Suppose INLABEL(x) ≠ INLABEL(z).
We set the following intermediate goal, as the main step towards finding x': Find the son of x' which is also an
ancestor of x. Denote the vertex that we search by w and let k be the index of the rightmost "1" in
INLABEL(w). It is not difficult to verify that k is the index of the leftmost "1" in A_{j-1}(x),...,A_0(x). So, we
find k. Clearly, INLABEL(w) consists of the l-k leftmost bits of INLABEL(x) followed by a single "1" and k
"0"s. Observe that w is the head of its INLABEL path (since the INLABEL number of its father x' is different
from INLABEL(w)). Therefore, w is HEAD(INLABEL(w)) and our intermediate goal is achieved. Finally, x'
is the father of w.
Step 4. LCA(x, y) is x' if LEVEL(x') ≤ LEVEL(y') and y' otherwise.


In the rest of this section we give additional implementation details required for the above query
processing.

Step 1. To find i, the index of the rightmost "1" in b, we compute i := ⌊log[INLABEL(x) xor INLABEL(y)]⌋.
This is similar to Step 2.1 in the INLABEL numbers computation of the previous section. Given i, b can be
computed similarly to Step 2.2 there.
Step 2. To find j we do the following:

Step 2.1. Compute the bitwise logical AND of ASCENDANT(x) and ASCENDANT(y) into COMMON.

Step 2.2. Compute 2^i ⌊COMMON / 2^i⌋ into COMMON_i. COMMON_i lists all the "1"s in both
A_l(x),...,A_i(x) and A_l(y),...,A_i(y).

Step 2.3. j is the index of the rightmost "1" in COMMON_i. To find j we compute
j := log(COMMON_i - [COMMON_i and (COMMON_i - 1)]), as in the ASCENDANT numbers computation
of the previous section.
The implementation of Step 3 uses the same techniques.
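
Putting the preprocessing stage and the four query steps together gives the following sketch. It is an illustration under assumptions of ours rather than the authors' code: the tree is a dictionary `children` of son lists, the function names are hypothetical, and only the index i (not the vertex b itself) is computed in Step 1, since the later steps use i alone.

```python
def floor_log2(k):
    return k.bit_length() - 1


def preprocess(children, root):
    """Return (parent, level, inlabel, ascendant, head) for the rooted tree."""
    parent, level = {root: None}, {root: 0}
    order, stack = [], [root]
    while stack:                                   # preorder traversal
        v = stack.pop()
        order.append(v)
        for c in reversed(children.get(v, [])):
            parent[c], level[c] = v, level[v] + 1
            stack.append(c)
    pre = {v: k + 1 for k, v in enumerate(order)}
    size = {v: 1 for v in order}
    for v in reversed(order):                      # sons before fathers
        if parent[v] is not None:
            size[parent[v]] += size[v]
    inlabel, ascendant, head = {}, {}, {}
    for v in order:                                # fathers before sons
        last = pre[v] + size[v] - 1
        i = floor_log2((pre[v] - 1) ^ last)
        inlabel[v] = (last >> i) << i
        p = parent[v]
        if p is not None and inlabel[p] == inlabel[v]:
            ascendant[v] = ascendant[p]
        else:                                      # v heads a new INLABEL path
            low = inlabel[v] & -inlabel[v]         # 2**(index of rightmost "1")
            ascendant[v] = (ascendant[p] if p is not None else 0) + low
            head[inlabel[v]] = v
    return parent, level, inlabel, ascendant, head


def lca(x, y, tables):
    parent, level, inlabel, ascendant, head = tables
    if inlabel[x] == inlabel[y]:                   # Case A
        return x if level[x] <= level[y] else y
    i = floor_log2(inlabel[x] ^ inlabel[y])        # Step 1
    common = ascendant[x] & ascendant[y]           # Step 2.1
    common_i = (common >> i) << i                  # Step 2.2
    j = floor_log2(common_i & -common_i)           # Step 2.3
    inlabel_z = ((inlabel[x] >> (j + 1)) << (j + 1)) | (1 << j)

    def climb(v):                                  # Step 3
        if inlabel[v] == inlabel_z:
            return v
        k = floor_log2(ascendant[v] & ((1 << j) - 1))
        inlabel_w = ((inlabel[v] >> (k + 1)) << (k + 1)) | (1 << k)
        return parent[head[inlabel_w]]

    xh, yh = climb(x), climb(y)
    return xh if level[xh] <= level[yh] else yh    # Step 4
```

A call such as `lca('c', 'e', preprocess({'r': ['a', 'b'], 'a': ['c', 'd'], 'b': ['e']}, 'r'))` touches only a constant number of table entries and bit operations per query.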

5.  The  Parallel  Preprocessing  Algorithm 

In this section we describe the parallel version of our preprocessing stage. It runs in O(log n) time using
n/log n processors. We make the following assumption regarding the representation of the input tree T. Its
n-1 edges are given in an array, where the incoming edges of each vertex are grouped successively. By our
definition of the tree T, its edges are directed towards the root.

Computing the labels in parallel. To compute the labels of the vertices in T we apply the Euler tour technique
for computing tree functions, which was given in [TV-85] and [Vi-85]. We will implement it, however, using
the O(log n) time optimal parallel list ranking algorithm of [CV-86a]. This list ranking algorithm is designed
for an EREW PRAM. It is based on expander graphs and its O(log n) time bound hides a constant which is
not very small. We note that [CV-86b] recently gave an alternative list ranking algorithm with the same time
and processor efficiencies. This alternative algorithm is designed for a PRAM which allows simultaneous
access to the same memory location for both read and write purposes (called CRCW PRAM). It is simpler and
its O(log n) time bound requires a small constant.

Below,  we  first  recollect  the  construction  required  for  the  Euler  tour  technique.  We  then  show  how  to  use 
it  for  computing  the  labels.  The  only  reason  which  forced  us  to  present  anew  the  Euler  tour  technique  is  that 
the  computation  of  the  ASCENDANT  numbers  has  not  appeared  elsewhere. 

Step 1. For each edge (v → u) in T we add its anti-parallel edge (u → v). Let H denote the new graph.
Since the in-degree and out-degree of each vertex in H are the same, H has an Euler path that starts and
ends in r. Step 2 computes this path into the vector of pointers D, where for each edge e of H, D(e) will
have the successor edge of e in the Euler path.

Step 2. For each vertex v of H we do the following. (Let the outgoing edges of v be (v → u_0),...,(v → u_{d-1}).)
D(u_i → v) := (v → u_{(i+1) mod d}), for i = 0,...,d-1. Now D has an Euler circuit. The "correction"
D(u_{d-1} → r) := end-of-list (where the out-degree of r is d) gives an Euler path which starts and ends in r.

We show how to use the Euler path in order to find PREORDER(v), PREORDER(v) + SIZE(v) - 1 and
LEVEL(v) for each vertex v in T.

Step 3. We assign two weights, W_1(e) and W_2(e), to each edge e in the Euler path as follows. (1) W_1(e) = 1 if
e is directed from r (that is, if e is not a tree edge) and W_1(e) = 0 otherwise. (2) W_2(e) = 1 if e is directed
from r and W_2(e) = -1 otherwise.

Step 4. We apply twice an optimal logarithmic time parallel list ranking algorithm to find for each e in H its
(weighted) distance from the start of the Euler path. The first application is relative to the weights W_1 and the
result is stored in DISTANCE_1(e). The second application is relative to the weights W_2 and the result is stored
in DISTANCE_2(e). Consider a vertex v ≠ r and let u be its father in T. PREORDER(v) is
DISTANCE_1(u → v) + 1, PREORDER(v) + SIZE(v) - 1 is DISTANCE_1(v → u) + 1, and LEVEL(v) is
DISTANCE_2(u → v). (These claims can be readily verified by the reader.)
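
Steps 1 to 4 can be checked with a sequential simulation. In the sketch below a plain walk along the successor pointers D stands in for the parallel list ranking algorithm, which is an assumption of ours made for brevity; the weighted distance of an edge is taken to include the weight of that edge itself. The function name and the dictionary-based tree representation are also ours.

```python
def euler_tour_labels(children, root):
    """Return (PREORDER, PREORDER + SIZE - 1, LEVEL) via the Euler path."""
    parent = {c: v for v, cs in children.items() for c in cs}
    if not children.get(root):
        return {root: 1}, {root: 1}, {root: 0}
    # Steps 1 and 2: successor pointers D on the directed edges of H.
    succ = {}
    for v in set(parent) | {root}:
        out = list(children.get(v, []))
        if v != root:
            out.append(parent[v])
        d = len(out)
        for i, u in enumerate(out):
            succ[(u, v)] = (v, out[(i + 1) % d])
    succ[(children[root][-1], root)] = None        # the "correction"
    # Steps 3 and 4: running sums of W_1 and W_2 along the path.
    preorder, last, level = {root: 1}, {root: len(parent) + 1}, {root: 0}
    dist1 = dist2 = 0
    e = (root, children[root][0])
    while e is not None:
        u, v = e
        down = parent.get(v) == u                  # edge directed away from r
        dist1 += 1 if down else 0                  # W_1
        dist2 += 1 if down else -1                 # W_2
        if down:
            preorder[v], level[v] = dist1 + 1, dist2
        else:
            last[u] = dist1 + 1                    # PREORDER(u) + SIZE(u) - 1
        e = succ[e]
    return preorder, last, level
```

On the tree with root r, sons a and b of r, sons c and d of a, and son e of b, the walk visits ten edges and reproduces the PREORDER numbers 1 to 6 in the order r, a, c, d, b, e.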

Step 5. Given PREORDER(v) and PREORDER(v) + SIZE(v) - 1 for each vertex v in T we compute
INLABEL(v) in constant time using n processors as in the serial algorithm.

Next, we show how to use the Euler path in order to find ASCENDANT(v) for each vertex v in T.

Step 6. We assign a (new) weight W(e) to each edge e in the Euler path as follows. For each vertex v ≠ r we
do the following. Let u be the father of v in T and let i be the index of the rightmost "1" in INLABEL(v). If
INLABEL(v) ≠ INLABEL(u), we assign W(u → v) = 2^i and W(v → u) = -2^i. The weight of all other edges is
set to zero.

Step 7. We apply again a parallel list ranking algorithm to find for each e in H its (weighted) distance from
the start of the Euler path. Consider a vertex v ≠ r and let u be its father in T. ASCENDANT(v) is the
distance of the edge (u → v) plus 2^l. Clearly, ASCENDANT(r) = 2^l.
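
Steps 6 and 7 can be simulated sequentially in the same spirit. In this sketch the Euler path is produced by an ordinary depth-first walk and the weighted distances by a running sum; both stand in for the parallel list ranking and are assumptions of ours, as are the function names.

```python
def euler_path(children, root):
    """Edges of the Euler path of H, in order, as (tail, head) pairs."""
    path, stack = [], [(root, iter(children.get(root, [])))]
    while stack:
        v, it = stack[-1]
        c = next(it, None)
        if c is None:
            stack.pop()
            if stack:
                path.append((v, stack[-1][0]))     # edge back to the father
        else:
            path.append((v, c))                    # edge down to a son
            stack.append((c, iter(children.get(c, []))))
    return path


def ascendant_by_prefix_sums(children, root, inlabel, n):
    l = n.bit_length() - 1                         # 2**l = INLABEL(root)
    parent = {c: v for v, cs in children.items() for c in cs}
    ascendant = {root: 1 << l}
    dist = 0
    for u, v in euler_path(children, root):
        down = parent.get(v) == u
        son = v if down else u
        # Step 6: weight +2**i going down into a path head, -2**i coming
        # back up from it, and zero on all other edges.
        w = 0
        if inlabel[son] != inlabel[parent[son]]:
            w = inlabel[son] & -inlabel[son]       # 2**i
            if not down:
                w = -w
        dist += w                                  # Step 7: weighted distance
        if down:
            ascendant[v] = dist + (1 << l)
    return ascendant
```

The negative weight on the upward edge cancels the contribution of a subtree once the walk leaves it, so at any downward edge the running sum holds exactly the bits of the INLABEL paths entered on the way from the root.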

We note that, given the labels, the table HEAD can be computed in constant time using n processors.

Complexity. Each of steps 4 and 7 needs n/log n processors and O(log n) time. Each of steps 1, 2, 3, 5, 6 and the
computation of HEAD needs n processors and O(1) time and can be readily simulated by n/log n processors
in O(log n) time. Thus, the parallel preprocessing stage can be done in a total of O(log n) time using n/log n
processors.


Acknowledgements.  We  thank  Noga  Alon  and  Yael  Maon  for  stimulating  discussions. 

REFERENCES 

[AHU-74] A.V. Aho, J.E. Hopcroft and J.D. Ullman, The Design and Analysis of Computer Algorithms,
Addison-Wesley, Reading, MA, 1974.

[CV-86a] R. Cole and U. Vishkin, "Approximate and exact parallel scheduling with applications to list,
tree and graph problems", Proc. 27th Annual Symp. on Foundations of Computer Science,
(1986), pp. 478-491.

[CV-86b] R. Cole and U. Vishkin, "Faster optimal parallel prefix sums and list ranking", TR 56/86, the
Moise and Frida Eskenasy Institute of Computer Science, Tel Aviv University (1986).

[HT-84] D. Harel and R.E. Tarjan, "Fast algorithms for finding nearest common ancestors", SIAM J.
Comput., 13 (1984), pp. 338-355.

[LV-86] G.M. Landau and U. Vishkin, "Introducing efficient parallelism into approximate string
matching", Proc. 18th ACM Symposium on Theory of Computing, 1986, pp. 220-230.

[LSV-86] G.M. Landau, B. Schieber and U. Vishkin, "Parallel construction of a suffix tree", TR 53/86,
the Moise and Frida Eskenasy Institute of Computer Science, Tel Aviv University (1986).

[MSV-86] Y. Maon, B. Schieber and U. Vishkin, "Parallel ear decomposition search (EDS) and
st-numbering in graphs", To appear in Theoretical Computer Science. Also in Proc. 2nd Aegean
Workshop on Computing, Lecture Notes in Computer Science 227, Springer-Verlag (1986), pp.
34-45.

[TV-85] R.E. Tarjan and U. Vishkin, "An efficient parallel biconnectivity algorithm", SIAM J. Comput.
14 (1985), pp. 862-874.

[Vi-83] U. Vishkin, "Synchronous parallel computation - a survey", TR-71, Dept. of Computer Science,
Courant Institute, NYU, (1983).

[Vi-85] U. Vishkin, "On efficient parallel strong orientation", Information Proc. Letters 20 (1985), pp.
235-240.


NYU  COMPSCI  TR-299      c  J 
Schieber,  Baruch 
On  finding  lowest  common 
ancestors 
